Refactor and a couple of fixes for adapter layer updates #1268
Conversation
`AdaLoraLayer` is a subclass of `LoraLayer`, so just checking for `isinstance(target, LoraLayer)` will match `AdaLoraLayer`, which we don't want when it comes to updating a `LoraLayer`. Now, we explicitly check that the layer is *not* an instance of `AdaLoraLayer`.
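Roughly, the check amounts to something like this (a minimal sketch with stand-in classes, not the actual PEFT code):

```python
# Minimal sketch of the isinstance check described above; the classes here
# are stand-ins, not the real PEFT implementations.

class LoraLayer: ...
class AdaLoraLayer(LoraLayer): ...

def is_plain_lora_layer(target) -> bool:
    # isinstance(target, LoraLayer) alone would also match AdaLoraLayer,
    # so AdaLora layers have to be excluded explicitly.
    return isinstance(target, LoraLayer) and not isinstance(target, AdaLoraLayer)

print(is_plain_lora_layer(LoraLayer()))     # True
print(is_plain_lora_layer(AdaLoraLayer()))  # False
```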
Thank you @BenjaminBossan for fixing many bugs while simplifying a lot of code. LGTM! 🔥
Thanks! Can you double-check that the integration tests pass, just in case?
@younesbelkada Integration tests don't work on forks, right? So we can only check after merging.
I just ran it here: https://github.com/huggingface/peft/actions/runs/7246319576
Tests are green! Feel free to merge!
This started as a simple refactor, but in the process I discovered a few issues as well, which should now be fixed. This is probably best reviewed by checking each commit separately.
Here are the descriptions for the individual changes:
1. Refactor: Move LoRA `update_layer` to child classes

For LoRA, so far, we have `update_layer` for `Linear`, `update_layer_embedding` for `Embedding`, and `update_layer_conv2d` for `Conv2d`, all defined on `LoraLayer`.

We can simplify the code by always using the name `update_layer` and by moving the layer-specific methods to the subclasses. So e.g. `update_layer_embedding` is moved to the `Embedding` class and renamed to `update_layer`. This way, the caller does not need to differentiate which type of layer it's calling, which makes the call site much simpler (see the simplification in `LoraModel._create_and_replace`). It also makes the code less error-prone (see change 4 described below).

Interestingly, this was already practiced for IA³, so the same change was not necessary there. But I did find the same method implemented twice, once on `IA3Layer` and once on `Linear`, so I removed one of the duplicates. For all other adapter methods, no change was required, as they only implement `Linear`.

Note: This could technically be backwards incompatible if users do some custom stuff with `LoraLayer`s. I could add a note for the next release that all `update_layer_*` methods have been consolidated under the single `update_layer` name.
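As a rough illustration of the new structure (a simplified sketch with toy classes, not the actual PEFT code):

```python
# Simplified sketch of the refactor: each subclass defines its own
# update_layer, so callers no longer dispatch on the layer type.
# These are toy classes, not the real PEFT implementations.

class LoraLayer:
    """Base class: shared bookkeeping only, no layer-specific update logic."""

    def __init__(self):
        self.r = {}

    def update_layer(self, adapter_name, r):
        raise NotImplementedError


class Linear(LoraLayer):
    def update_layer(self, adapter_name, r):
        self.r[adapter_name] = r
        print(f"Linear: set up rank-{r} LoRA weights for {adapter_name!r}")


class Embedding(LoraLayer):
    # formerly LoraLayer.update_layer_embedding, now simply update_layer
    def update_layer(self, adapter_name, r):
        self.r[adapter_name] = r
        print(f"Embedding: set up rank-{r} LoRA embeddings for {adapter_name!r}")


# The caller can treat all layer types uniformly:
for target in (Linear(), Embedding()):
    target.update_layer("default", r=8)
```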
2. Systematic handling of `r` (rank) <= 0

Always raise an error when `r <= 0`, not only for LoRA. Also, removed the later check for `r > 0` in LoRA layers, since we already check for `r <= 0`.

Note: This could technically also be considered backwards incompatible, but `r <= 0` should not work correctly anyway, so it is better to raise an error right away.
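In essence, the validation amounts to something like this (a hedged sketch; the exact error message in PEFT may differ):

```python
# Sketch of the up-front rank validation. With this check in place, later
# `if r > 0:` branches inside the LoRA layers become redundant.

def validate_rank(r: int) -> None:
    if r <= 0:
        raise ValueError(f"`r` should be a positive integer value but the value passed is {r}")

validate_rank(8)  # fine

try:
    validate_rank(0)
except ValueError as exc:
    print(exc)  # `r` should be a positive integer value but the value passed is 0
```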
3. Fix broken `__repr__` method on `QuantLinear`

It was indented too deep and was thus never applied.
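For illustration, this is the kind of bug meant here (toy classes, not the actual `QuantLinear` code):

```python
# When a __repr__ is indented too deep, it ends up as a local function inside
# another method instead of a method on the class, so it is silently never
# used. Toy example, not the real QuantLinear.

class Broken:
    def forward(self, x):
        return x

        def __repr__(self):  # nested inside forward() -> never attached to the class
            return "lora." + super().__repr__()


class Fixed:
    def forward(self, x):
        return x

    def __repr__(self):  # defined at class level -> actually used
        return "lora." + super().__repr__()


print(repr(Broken()))  # default object repr; the custom __repr__ is ignored
print(repr(Fixed()))   # lora.<__main__.Fixed object at 0x...>
```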
4. Fix bug for updating LoRA GPTQ and IA³ bnb layers

Before this fix, when adding a 2nd adapter to a model, we did not correctly check whether there was already an adapter layer in the model when dealing with LoRA GPTQ or IA³ bnb layers. As a consequence, instead of updating the existing layers, we would create a new layer, and the existing layer would be set as the `base_layer` of that new layer. Now, we correctly update the existing layer to add the new adapter.

For this fix to work correctly with LoRA and GPTQ, I had to add a check for `qweight` in `update_layer`, since we only checked for `weight` before. This was a mistake that didn't surface so far because of the error described in the previous paragraph.

Tests were extended to check this. They fail on the current main but pass on my machine with this PR.
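Schematically, the fixed dispatch behaves like this (a simplified sketch with stand-in classes; GPTQ-quantized linears store their parameters under `qweight` rather than `weight`):

```python
# Simplified sketch of the fix: only wrap the target once, and recognize an
# existing adapter layer even when the wrapped base module exposes `qweight`
# (GPTQ) instead of `weight`. Stand-in classes, not the real PEFT code.

class AdapterLayer:
    def __init__(self, base_layer):
        self.base_layer = base_layer
        self.adapters = []

    def update_layer(self, adapter_name, r):
        self.adapters.append((adapter_name, r))


def add_adapter(target, adapter_name, r):
    base = getattr(target, "base_layer", None)
    has_params = base is not None and (hasattr(base, "weight") or hasattr(base, "qweight"))
    if isinstance(target, AdapterLayer) and has_params:
        target.update_layer(adapter_name, r)  # correct: add to the existing layer
        return target
    new_layer = AdapterLayer(target)          # first time: wrap the base module once
    new_layer.update_layer(adapter_name, r)
    return new_layer


class FakeGPTQLinear:
    qweight = object()  # quantized weights live under `qweight`, not `weight`


layer = add_adapter(FakeGPTQLinear(), "adapter_1", r=8)
layer = add_adapter(layer, "adapter_2", r=8)  # updates in place, no double wrapping
print(layer.adapters)  # [('adapter_1', 8), ('adapter_2', 8)]
```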